Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Donot self-allocate IDs in Beam by default. #1809

Merged
merged 4 commits into from
Oct 20, 2022

Conversation

jianglai
Copy link
Collaborator

@jianglai jianglai commented Oct 5, 2022

Per b/250948425, it is dangerous to implicitly allow all Beam pipelines
to create buildables by self allocating the IDs. This change makes it so
that one has to explicitly use self allocated IDs in Beam.

A boolean is added to the pipeline option so that it can be passed to
the beam worker initializer that controls the behavior of the JVM on
each worker. Note that we did not add the option in the metadata.json file
because we did not want people to use the override at run time when launching
a pipeline, due to the risk. As shown in RdePipeline.java, we instead
explicitly hard-code the option in the pipeline. There is nothing that
stops one to supply that option when launching the pipeline, but it's
not advised.

Tested=deployed the pipeline alpha and ran it.


This change is Reviewable

@jianglai jianglai added the WIP Work in progress. Don't review yet. label Oct 5, 2022
@jianglai jianglai removed the WIP Work in progress. Don't review yet. label Oct 5, 2022
@jianglai jianglai requested review from CydeWeys and weiminyu October 5, 2022 22:38
Copy link
Collaborator

@weiminyu weiminyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @CydeWeys and @jianglai)


core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

    // as you can build the entities.
    if (registryOptions.getUseSelfAllocatedId()) {
      IdService.useSelfAllocatedId();

Instead of using a flag, may be we can provide a bring-your-own-id-service api.

Open up IdService to accept user provided Supplier if the environment is BEAM,
and do not provide a default implementation (throw is on beam and supplier is not provided.).

Code quote:

    if (registryOptions.getUseSelfAllocatedId()) {
      IdService.useSelfAllocatedId();
    }

Copy link
Collaborator

@weiminyu weiminyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @CydeWeys and @jianglai)


core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

Instead of using a flag, may be we can provide a bring-your-own-id-service api.

Open up IdService to accept user provided Supplier if the environment is BEAM,
and do not provide a default implementation (throw is on beam and supplier is not provided.).

This way we can plug in a raw datastore api to allocate ids.

Copy link
Member

@CydeWeys CydeWeys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @CydeWeys, @jianglai, and @weiminyu)


core/src/main/java/google/registry/model/IdService.java line 67 at r2 (raw file):

   * @see #isSelfAllocated
   */
  public static void useSelfAllocatedId() {

Can we use different verbiage for this? E.g. "fake id", "testing id", "fake testing id" or similar? I don't think "self-allocated ID" conveys the right meaning, namely, that this is horribly unsafe for anything in production.

@jianglai jianglai force-pushed the allocate-id branch 5 times, most recently from 8082cc5 to 733fc8a Compare October 17, 2022 15:44
@jianglai jianglai added the kokoro:force-run Force a Kokoro build. label Oct 17, 2022
@domain-registry-eng domain-registry-eng removed the kokoro:force-run Force a Kokoro build. label Oct 17, 2022
@jianglai jianglai requested review from CydeWeys and weiminyu October 17, 2022 17:47
Copy link
Collaborator Author

@jianglai jianglai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL.

Reviewed 4 of 4 files at r1, 2 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @CydeWeys and @weiminyu)


core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

This way we can plug in a raw datastore api to allocate ids.

Done.


core/src/main/java/google/registry/model/IdService.java line 67 at r2 (raw file):

Previously, CydeWeys (Ben McIlwain) wrote…

Can we use different verbiage for this? E.g. "fake id", "testing id", "fake testing id" or similar? I don't think "self-allocated ID" conveys the right meaning, namely, that this is horribly unsafe for anything in production.

Reworked the class to take a ID supplier per Weimin's suggestion. I think the supplier used in test can still be called SelfAllocatedIdSupplier as it is an accurate description of what it does, whereas a FakeIdSupplier does not convey exactly what kind of "fake" it is, e.g. is it a constant (which should probably better named ConstantIdSupplier), or random (RandomIdSupplier).

@weiminyu
Copy link
Collaborator

core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

  private static Supplier<Long> idSupplier =
      RegistryEnvironment.UNITTEST.equals(RegistryEnvironment.get())
          ? SelfAllocatedIdSupplier.getInstance()

Building on Ben's comment, maybe we should call this class NonUniqueIdSupplier, and
make the useSelfAllocatedId BEAM flag a Enum with NonUniqueId, OfyId, and maybe RawDatastoreId if it comes to that.

Code quote:

SelfAllocatedIdSupplier

Copy link
Collaborator Author

@jianglai jianglai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @CydeWeys and @weiminyu)


core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

Building on Ben's comment, maybe we should call this class NonUniqueIdSupplier, and
make the useSelfAllocatedId BEAM flag a Enum with NonUniqueId, OfyId, and maybe RawDatastoreId if it comes to that.

Just to clarify, do you mean to have a lookup map from the Beam enums to ID suppliers that one use to determine the override?

Copy link
Collaborator Author

@jianglai jianglai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @CydeWeys and @weiminyu)


core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

Previously, jianglai (Lai Jiang) wrote…

Just to clarify, do you mean to have a lookup map from the Beam enums to ID suppliers that one use to determine the override?

Also, I think the current iteration is explicit enough to discourage the use of ID supplier overrides. It also allows for the further migration to a SQL-provided ID (as another ID provider), by which time we can simply remove the setter, as Beam has access to SQL. It doesn't seem worth it to make the Beam flags more flexible as it would just be throwaway work.

Copy link
Member

@CydeWeys CydeWeys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @weiminyu)

Per b/250948425, it is dangerous to implicitly allow all Beam pipelines
to create buildables by self allocating the IDs. This change makes it so
that one has to explicitly request self allocation in Beam.

The boolean is added to the pipeline option so that it can be passed to
the beam worker initializer that controls the behavior of the JVM on
each worker. Note that we did not add the option in the metadata.json file
because we did not want people to use the override at run time when launching
a pipeline, due to the risk. As shown in RdePipeline.java, we instead
explicitly hard-code the option in the pipeline. There is nothing that
stops one to supply that option when launching the pipeline, but it's
not advised.

Tested=deployed the pipeline alpha and ran it.
Copy link
Collaborator Author

@jianglai jianglai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @weiminyu)

@jianglai jianglai enabled auto-merge (squash) October 19, 2022 20:04
@jianglai jianglai changed the title Does not self allocate IDs in Beam by default. Donot self-allocate IDs in Beam by default. Oct 20, 2022
@jianglai jianglai dismissed weiminyu’s stale review October 20, 2022 00:44

Approval obtained.

@jianglai jianglai merged commit addef17 into google:master Oct 20, 2022
@jianglai jianglai deleted the allocate-id branch October 20, 2022 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants